7 research outputs found

    A new approach to in silico SNP detection and some new SNPs in the Bacillus anthracis genome

    Background: Bacillus anthracis is one of the most monomorphic pathogens known. Identification of polymorphisms in its genome is essential for taxonomic classification, for determination of recent evolutionary changes, and for evaluation of pathogenic potency. Findings: In this work the genomes of three Bacillus anthracis strains are compared and previously unpublished single nucleotide polymorphisms (SNPs) are revealed. Moreover, it is shown that, despite the highly monomorphic nature of Bacillus anthracis, the SNPs are (1) abundant in the genome and (2) distributed relatively uniformly across the sequence. Conclusions: The findings support the proposition that SNPs, together with indels and variable number tandem repeats (VNTRs), can be used effectively not only for the differentiation of perfect strain data, but also for the comparison of moderately incomplete, noisy and, in some cases, unknown Bacillus anthracis strains. When the data is of still lower quality, a new DNA sequence fingerprinting approach can be used; it relies on recently introduced markers, called cyclic difference sets, that are based on combinatorial-analytic concepts.

    An output-sensitive algorithm for the minimization of 2-dimensional String Covers

    String covers are a powerful tool for analyzing the quasi-periodicity of 1-dimensional data and find applications in automata theory, computational biology, coding and the analysis of transactional data. A cover of a string T is a string C for which every letter of T lies within some occurrence of C. String covers have been generalized in many ways, leading to k-covers, λ-covers and approximate covers, and have been studied in different contexts such as indeterminate strings. In this paper we generalize string covers to the context of 2-dimensional data, such as images. We show how they can be used for the extraction of textures from images and identification of primitive cells in lattice data. This has interesting applications in image compression, procedural terrain generation and crystallography.
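The 1-dimensional definition of a cover given above can be illustrated with a minimal sketch. It is not the paper's 2-dimensional, output-sensitive algorithm; it is a naive check written only to make the definition concrete, and the function names `occurrences`, `is_cover` and `shortest_cover` are hypothetical.

```python
def occurrences(text, pattern):
    """Return the start indices of all (possibly overlapping) occurrences."""
    starts = []
    i = text.find(pattern)
    while i != -1:
        starts.append(i)
        i = text.find(pattern, i + 1)
    return starts


def is_cover(text, candidate):
    """True if every letter of text lies within some occurrence of candidate."""
    if not candidate or len(candidate) > len(text):
        return False
    covered_up_to = 0  # index of the first letter not yet covered
    for start in occurrences(text, candidate):
        if start > covered_up_to:
            return False  # a gap: some letter lies in no occurrence
        covered_up_to = start + len(candidate)
    return covered_up_to == len(text)


def shortest_cover(text):
    """Naive search: a cover must be a prefix of text, so try prefixes by length."""
    for length in range(1, len(text) + 1):
        if is_cover(text, text[:length]):
            return text[:length]
    return text  # only reached for the empty string
```

For example, `is_cover("abaababaaba", "aba")` holds because the occurrences of `aba` at positions 0, 3, 5 and 8 overlap or abut so that no letter is left uncovered. The prefix search is quadratic or worse and is meant for illustration only.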

    Hierarchical structure of cascade of primary and secondary periodicities in Fourier power spectrum of alphoid higher order repeats

    Background: Identification of approximate tandem repeats is an important task of broad significance and remains a challenging problem in computational genomics. Often there is no single best approach to periodicity detection, and a combination of different methods may improve prediction accuracy. The discrete Fourier transform (DFT) has been used extensively to study primary periodicities in DNA sequences. Here we investigate the application of the DFT method to identify and study alphoid higher order repeats. Results: We used a DFT-based method, with the symbolic sequence mapped into a numerical sequence, to identify and study alphoid higher order repeats (HORs). For HORs the power spectrum shows an equidistant frequency pattern with a characteristic two-level hierarchical organization as the signature of the HOR. Our case study was the 16mer HOR tandem in AC017075.8 from human chromosome 7. A very long array of equidistant peaks at multiple frequencies (more than a thousand higher harmonics) is based on the fundamental frequency of the 16mer HOR. A pronounced subset of equidistant peaks is based on multiples of the fundamental HOR frequency (multiplication factor n for an nmer) and higher harmonics. In general, an nmer HOR pattern contains equidistant secondary periodicity peaks, with a pronounced subset of equidistant primary periodicity peaks. This hierarchical pattern as a signature for HOR detection is robust with respect to monomer insertions and deletions, random sequence insertions, etc. For a monomeric alphoid sequence only primary periodicity peaks are present. The 1/f^β noise and the periodicity-three pattern are missing from power spectra in alphoid regions, in accordance with expectations. Conclusion: DFT provides a robust detection method for higher order periodicity. The easily recognizable HOR power spectrum is characterized by a hierarchical two-level equidistant pattern: higher harmonics of the fundamental HOR frequency (secondary periodicity) and a subset of pronounced peaks corresponding to the constituent monomers (primary periodicity). The number of lower frequency peaks (secondary periodicity) below the frequency of the first primary periodicity peak reveals the size of the nmer HOR, i.e., the number n of monomers contained in the consensus HOR.
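The two-level pattern described above can be reproduced on a toy sequence. This is a sketch under stated assumptions: the abstract does not say which symbolic-to-numerical mapping was used, so the code assumes one binary indicator sequence per base and sums the four power spectra. The 8-letter monomers are invented for illustration and are not alphoid data, and all function names are hypothetical.

```python
import cmath


def indicator(seq, base):
    """Assumed mapping: 1.0 where seq has the given base, else 0.0."""
    return [1.0 if ch == base else 0.0 for ch in seq]


def power_spectrum(signal):
    """Naive O(N^2) DFT power |X[k]|^2 for k = 0 .. N//2."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def dna_power_spectrum(seq):
    """Sum of the indicator power spectra over the four bases."""
    total = [0.0] * (len(seq) // 2 + 1)
    for base in "ACGT":
        for k, p in enumerate(power_spectrum(indicator(seq, base))):
            total[k] += p
    return total


def peak_frequencies(spectrum, threshold):
    """Indices k > 0 whose power exceeds the threshold (k = 0 is the DC term)."""
    return [k for k, p in enumerate(spectrum) if k > 0 and p > threshold]


# Toy 2mer "HOR": two invented monomers of length 8 differing by one base.
monomer_a = "ACGTTGCA"
monomer_b = "ACGATGCA"
copies = 4
sequence = (monomer_a + monomer_b) * copies  # N = 64, HOR period 16

spectrum = dna_power_spectrum(sequence)
# An exactly periodic sequence has power only at multiples of N / period,
# here multiples of `copies`; the threshold just filters rounding noise.
peaks = peak_frequencies(spectrum, 1e-6)
```

In this toy case the peaks fall only at multiples of the fundamental index 4 (the 16-letter HOR period), and the peak at index 8, which corresponds to the 8-letter monomer period, is stronger than the one at index 4. Exactly one secondary peak lies below the first primary peak. Real sequences are not exactly periodic, so an FFT routine and a data-driven threshold would be needed in practice.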